Forward-merge branch-23.12 to branch-24.02 #14422
Merged
raydouglass
merged 18 commits into
rapidsai:branch-24.02
from
bdice:branch-24.02-merge-23.12
Nov 16, 2023
Update the nvCOMP version used for cuIO compression/decompression to 3.0.4. Authors: - Vukasin Milovanovic (https://github.com/vuule) - Bradley Dice (https://github.com/bdice) Approvers: - Bradley Dice (https://github.com/bdice) - Ray Douglass (https://github.com/raydouglass) URL: rapidsai#13815
All of these wrappers have now been upstreamed into Cython as of Cython 3.0.3. Contributes to rapidsai#14023 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - Bradley Dice (https://github.com/bdice) - Jake Awe (https://github.com/AyodeAwe) URL: rapidsai#14382
Creates a normalizing offsets iterator that returns an int64 value given either int32 or int64 column data. Depends on rapidsai#14206 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Divye Gala (https://github.com/divyegala) - Yunsong Wang (https://github.com/PointKernel) URL: rapidsai#14234
…rapidsai#14364)
* Update dependency lists
* Update wheel building to stop needing manual installations
* Update wheel dependency with alpha spec
* Rename the package
* Update update-version.sh
* Update conda/recipes/dask-cudf/meta.yaml Co-authored-by: GALI PREM SAGAR <[email protected]>
* Make pip/conda dependencies consistent and fix recipe
* dfg
* Apply suggestions from code review
---------
Co-authored-by: GALI PREM SAGAR <[email protected]>
…sai#14399) Corrects failures seen in C++ CI where libnvbench.so can't be found Authors: - Robert Maynard (https://github.com/robertmaynard) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Bradley Dice (https://github.com/bdice) URL: rapidsai#14399
…14388) Closes rapidsai#14384. `x.startswith(y)` is not a sufficient check for whether `x` is a subdirectory of `y`: it causes `pandasai` to be reported as a sub-package of `pandas`. Authors: - Ashwin Srinath (https://github.com/shwina) Approvers: - https://github.com/brandon-b-miller URL: rapidsai#14388
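The failure mode described in this commit can be reproduced with a short sketch. This is not the code from the fix itself; `is_same_or_subdirectory` is a hypothetical helper and the paths are made up for illustration. It compares whole path components instead of raw string prefixes:

```python
from pathlib import PurePosixPath


def is_same_or_subdirectory(child: str, parent: str) -> bool:
    """Return True if `child` is `parent` itself or lies beneath it.

    Hypothetical helper for illustration only: it compares whole path
    components, so a sibling directory that merely shares a name prefix
    does not match.
    """
    child_parts = PurePosixPath(child).parts
    parent_parts = PurePosixPath(parent).parts
    return child_parts[: len(parent_parts)] == parent_parts


# Illustrative paths, not taken from the PR.
pandas_dir = "/site-packages/pandas"
pandasai_dir = "/site-packages/pandasai"

# The naive string check wrongly treats pandasai as living under pandas.
naive_result = pandasai_dir.startswith(pandas_dir)

# The component-wise check does not.
component_result = is_same_or_subdirectory(pandasai_dir, pandas_dir)
```

Comparing components is one of several ways to do this; appending a trailing separator before calling `startswith` is another common approach.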
Refactor the currently outdated cudf_kafka build setup to use skbuild instead. Authors: - Jeremy Dyer (https://github.com/jdye64) - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Bradley Dice (https://github.com/bdice) - AJ Schmidt (https://github.com/ajschmidt8) URL: rapidsai#14292
Adds a new `BytePairEncoder` class to cuDF

```
>>> import cudf
>>> from cudf.core.byte_pair_encoding import BytePairEncoder
>>> mps = cudf.read_text('merges.txt', delimiter='\n', strip_delimiters=True)
>>> bpe = BytePairEncoder(mps)
>>> str_series = cudf.Series(['This is a sentence', 'thisisit'])
>>> bpe(str_series)
0    This is a sent ence
1             this is it
dtype: object
```

This class wraps the existing `nvtext::byte_pair_encoding` APIs to load the merge-pairs data and encode a column of strings. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#13891
…4393) Fixes a bug introduced in rapidsai#14336 when trying to simplify the token-counting logic as per this discussion rapidsai#14336 (comment) The simplification caused an error which was found when running the nvtext benchmarks. The appropriate gtest has been updated to cover this case now. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Karthikeyan (https://github.com/karthikeyann) URL: rapidsai#14393
This PR switches remaining usages of `dask` dependencies to use `rapids-dask-dependency` Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Bradley Dice (https://github.com/bdice) - Jake Awe (https://github.com/AyodeAwe) - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#14407
This PR contributes to rapidsai#13744.
- Added stream parameters to public APIs `cudf::io::read_csv` and `cudf::io::write_csv`
- Added stream gtests
Authors: - https://github.com/shrshi - Karthikeyan (https://github.com/karthikeyann) Approvers: - Karthikeyan (https://github.com/karthikeyann) - Vukasin Milovanovic (https://github.com/vuule) - Yunsong Wang (https://github.com/PointKernel) URL: rapidsai#14340
…ai#14411) Port NVIDIA/nvbench#148 to cudf so that nvbench benchmarks work now that we always use a static version of nvbench. Authors: - Robert Maynard (https://github.com/robertmaynard) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#14411
…apidsai#14390) Noticed this while trying to clean up `as_column` Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#14390
…idsai#14367) Issue rapidsai#14325 Use uint when reading/writing nano stats because nanoseconds have int32 encoding (different from both uint32 and sint32, _obviously_), which does not use zigzag. sint32 uses zigzag, and uint32 does not allow negative numbers, so we can use uint since we'll never have negative nanoseconds. Also disabled writing the nanoseconds because they should only be written after ORC-135; we don't write the version, so readers get confused if nanoseconds are there. Planning to re-enable once we start writing the version. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Nghia Truong (https://github.com/ttnghia) URL: rapidsai#14367
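The int32 / uint32 / sint32 distinction in this commit follows the Protocol Buffers wire format, where sint32 is zigzag-mapped before varint encoding and int32 and uint32 are not. A minimal sketch of that difference, assuming the standard protobuf rules; the helper names `zigzag32` and `varint` are hypothetical and are not cuDF functions:

```python
def zigzag32(n: int) -> int:
    """Map a signed 32-bit integer to an unsigned one (sint32 style)."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


nanos = 5

# int32 and uint32 produce the same bytes for a non-negative value.
plain_bytes = varint(nanos)

# sint32 zigzag-maps first, so the same value yields different bytes.
zigzag_bytes = varint(zigzag32(nanos))

# A negative int32 is sign-extended to 64 bits before encoding.
negative_int32_bytes = varint(-1 & 0xFFFFFFFFFFFFFFFF)
```

Because the plain and zigzag byte sequences differ even for small positive values, a reader that decodes the field as sint32 would misread data written as int32, while decoding as uint32 is safe as long as the value is never negative.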
Fixes: rapidsai#14398 This PR raises an error in `reindex` API when reindexing is performed on a non-unique index column. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Matthew Roeschke (https://github.com/mroeschke) - Lawrence Mitchell (https://github.com/wence-) URL: rapidsai#14400
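Why reindexing on a non-unique index is rejected can be seen in a plain-Python sketch. This is not cuDF's implementation; `reindex` below is a hypothetical list-based helper, and the error message is illustrative only:

```python
def reindex(index, values, new_index, fill=None):
    """Look up `values` by label for each entry of `new_index`.

    Hypothetical helper: raises if `index` has duplicate labels, because
    a duplicated label has no single value to return.
    """
    if len(set(index)) != len(index):
        raise ValueError("cannot reindex on an axis with duplicate labels")
    lookup = dict(zip(index, values))
    return [lookup.get(label, fill) for label in new_index]


# Unique labels: every requested label maps to at most one value.
unique_result = reindex(["a", "b", "c"], [1, 2, 3], ["c", "a", "z"])

# Duplicate labels: "a" could mean 1 or 2, so the call is rejected.
try:
    reindex(["a", "a", "b"], [1, 2, 3], ["a", "b"])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)
```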
rapidsai#14407 added a dask dependency to custreamz, but pinned it too tightly by requiring the exact same version. This is not valid because rapids-dask-dependency won't release a new version corresponding to each new cudf release, so pinning to the exact same version up to the alpha creates an unsatisfiable constraint. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Ray Douglass (https://github.com/raydouglass) - Bradley Dice (https://github.com/bdice) - GALI PREM SAGAR (https://github.com/galipremsagar)
bdice
requested review from
mroeschke,
galipremsagar,
harrism and
nvdbaranec
November 16, 2023 00:21
github-actions
bot
added
libcudf
Affects libcudf (C++/CUDA) code.
Python
Affects Python cuDF API.
labels
Nov 16, 2023
3 tasks
Labels
CMake
CMake build issue
improvement
Improvement / enhancement to an existing function
libcudf
Affects libcudf (C++/CUDA) code.
non-breaking
Non-breaking change
Python
Affects Python cuDF API.
Manual forward merge from 23.12 to 24.02. This PR should not be squashed. Closes #14406.